CoReMo System (Contextual Reference Monotony) - Lab Report for PAN at CLEF 2010

Authors

  • Diego Antonio Rodríguez Torrejón
  • José Manuel Martín Ramos
Abstract

This paper presents a new approach to very fast monolingual external plagiarism detection, based on a modified n-gram concept (the contextual n-gram), a new high-precision contextual Information Retrieval engine, and a new pruning strategy (Referential Monotony) for detecting plagiarism and its limits. The evaluation results are comparable to those of the winning team at PAN'09, but were achieved with remarkable speed (35 min) and low hardware requirements (a single laptop).

Similar resources

Crosslingual CoReMo System (Contextual Reference Monotony) - Notebook for PAN at CLEF 2011

This paper presents an extended version of the external CoReMo System (Contextual Reference Monotony, ranked 6th in PAN2010), now with crosslingual capability (ranked 5th in PAN2011 / Plagdet 0.2340). It is not the best-ranked system for translated plagiarism (ranked 3rd / Plagdet 0.3587), but it has high reliability and speed (global results in 30 minutes), low computer requirements and its own intern...

Text Alignment Module in CoReMo 2.1 Plagiarism Detector Notebook for PAN at CLEF 2013

This paper describes the process and basics of the Text Alignment Module of the CoReMo 2.1 Plagiarism Detector, which won the Plagiarism Detection Text Alignment task at PAN 2013 on both evaluation criteria, efficacy and efficiency, achieving the best detections and the best runtime. Its high detection efficacy is mainly due to the special features of the contextual n-gram...

Improving the Reliability of the Plagiarism Detection System - Lab Report for PAN at CLEF 2010

In this paper we describe our approach at the PAN 2010 plagiarism detection competition. We refer to the system we have used in PAN’09. We then present the improvements we have tried since the PAN’09 competition, and their impact on the results on the development corpus. We describe our experiments with intrinsic plagiarism detection and evaluate them. We then discuss the computational cost of ...

External Plagiarism Detection Based on Standard IR Technology and Fast Recognition of Common Subsequences - Lab Report for PAN at CLEF 2010

The plagiarism detection system described in this paper aims to bring external plagiarism detection to the desktop. The main ideas are to incorporate standard IR technologies for candidate selection and efficient data structures for the detailed analysis between a suspicious document and a candidate document. Given that the system so far has only reached prototype status, the first results l...

CoReMo 2.3 Plagiarism Detector Text Alignment Module - Notebook for PAN at CLEF 2014

This paper presents the basics of the three tuning approaches of the evolving CoReMo Plagiarism Detector, focused on the Text Alignment task. In the last PAN edition, it was observed that the different corpora could condition the necessary tuning, and the results using an overfitted tuning from a different corpus could be far from the...

Publication date: 2010